Modeling the Training Iteration Time for Heterogeneous Distributed Deep Learning Systems


Abstract

Distributed deep learning systems effectively respond to the increasing demand for large-scale data processing in recent years. However, the significant investment required to build distributed systems with powerful computing nodes places a huge financial burden on developers and researchers. It would therefore be valuable to predict the precise benefit, i.e., how many times of speedup a distributed system can achieve compared with training on a single machine (or a few machines), before actually building such a big system. To address this problem, this paper presents a novel performance model for the training iteration time of heterogeneous distributed deep learning systems, based on the characteristics of the parameter server (PS) architecture with the bulk synchronous parallel (BSP) synchronization style. The accuracy of our model is demonstrated by comparing it with real measurement results from TensorFlow when training different neural networks on various kinds of hardware testbeds: the prediction accuracy is higher than 90% in most cases.
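
The abstract does not spell out the model's formula, but as rough intuition for what an iteration-time model of a BSP parameter-server system could look like, here is a minimal sketch assuming per-worker compute time plus gradient push/pull time to the PS, with the synchronization barrier set by the slowest worker. The function name, parameters, and the numbers in the usage example are hypothetical and not taken from the paper.

```python
# Illustrative sketch only (not the paper's actual model): a simple
# BSP/parameter-server iteration-time estimate, assuming each worker's
# cost is its forward/backward compute time plus the time to push
# gradients to and pull updated weights from the parameter server.

def iteration_time(compute_times, grad_bytes, bandwidths, ps_overhead=0.0):
    """Estimate one BSP training iteration on a parameter-server system.

    compute_times -- per-worker forward/backward time (seconds)
    grad_bytes    -- size of the gradients each worker exchanges (bytes)
    bandwidths    -- per-worker effective link bandwidth to the PS (bytes/s)
    ps_overhead   -- fixed aggregation/update cost at the parameter server (s)
    """
    per_worker = [
        t_comp + 2.0 * grad_bytes / bw   # push gradients + pull new weights
        for t_comp, bw in zip(compute_times, bandwidths)
    ]
    # BSP: every worker waits at the barrier for the slowest one.
    return max(per_worker) + ps_overhead


if __name__ == "__main__":
    # Hypothetical example: 4 heterogeneous workers exchanging ~100 MB
    # of gradients, one of them on a much slower network link.
    t = iteration_time(
        compute_times=[0.30, 0.35, 0.50, 0.28],        # seconds
        grad_bytes=100e6,                               # ~100 MB
        bandwidths=[1.25e9, 1.25e9, 125e6, 1.25e9],     # 10 GbE vs 1 GbE
        ps_overhead=0.02,
    )
    print(f"Predicted iteration time: {t:.3f} s")
```

The `max(...)` term is what makes heterogeneity matter under BSP: a single slow worker or slow link dominates the predicted iteration time, which is why a per-worker model is needed for heterogeneous clusters.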

Similar articles

Deep learning-based CAD systems for mammography: A review article

Breast cancer is one of the most common types of cancer in women. Screening mammography is a low‑dose X‑ray examination of breasts, which is conducted to detect breast cancer at early stages when the cancerous tumor is too small to be felt as a lump. Screening mammography is conducted for women with no symptoms of breast cancer, for early detection of cancer when the cancer is most treatable an...

Deep Learning for Time Series Modeling

Demand forecasting is crucial to electricity providers because their ability to produce energy exceeds their ability to store it. Excess demand can cause “brown outs,” while excess supply ends in waste. In an industry worth over $1 trillion in the U.S. alone [1], almost 9% of GDP [2], even marginal improvements can have a huge impact. Any plan toward energy efficiency should include enhanced ut...

Homomorphic Parameter Compression for Distributed Deep Learning Training

Distributed training of deep neural networks has received significant research interest, and its major approaches include implementations on multiple GPUs and clusters. Parallelization can dramatically improve the efficiency of training deep and complicated models with large-scale data. A fundamental barrier against the speedup of DNN training, however, is the trade-off between computation and ...

Deep Symbolic Representation Learning for Heterogeneous Time-series Classification

In this paper, we consider the problem of event classification with multi-variate time series data consisting of heterogeneous (continuous and categorical) variables. The complex temporal dependencies between the variables, combined with the sparsity of the data, make the event classification problem particularly challenging. Most state-of-the-art approaches address this either by designing hand-enginee...

Adaptive Distributed Consensus Control for a Class of Heterogeneous and Uncertain Nonlinear Multi-Agent Systems

This paper has been devoted to the design of a distributed consensus control for a class of uncertain nonlinear multi-agent systems in the strict-feedback form. The communication between the agents has been described by a directed graph. Radial-basis function neural networks have been used for the approximation of the uncertain and heterogeneous dynamics of the followers as well as the effect o...

Journal

Journal title: International Journal of Intelligent Systems

Year: 2023

ISSN: 1098-111X, 0884-8173

DOI: https://doi.org/10.1155/2023/2663115